7 research outputs found

    The LDBC Graphalytics Benchmark

    Full text link
    In this document, we describe LDBC Graphalytics, an industrial-grade benchmark for graph analysis platforms. The main goal of Graphalytics is to enable the fair and objective comparison of graph analysis platforms. Due to the diversity of bottlenecks and performance issues such platforms need to address, Graphalytics consists of a set of selected deterministic algorithms for full-graph analysis, standard graph datasets, synthetic dataset generators, and reference output for validation purposes. Its test harness produces deep metrics that quantify multiple kinds of system scalability, weak and strong, and robustness, such as failures and performance variability. The benchmark also balances comprehensiveness with the runtime necessary to obtain the deep metrics. The benchmark comes with open-source software for generating performance data, for validating algorithm results, for monitoring and sharing performance data, and for obtaining the final benchmark result as a standard performance report.

    Graphless: Toward serverless graph processing

    No full text

    Exploring HPC and Big Data Convergence: A Graph Processing Study on Intel Knights Landing

    No full text
    The question 'Can big data and HPC infrastructure converge?' has important implications for many operators and clients of modern computing. However, answering it is challenging. The hardware is currently different, and fast evolving: big data uses machines with modest numbers of fat cores per socket, large caches, and much memory, whereas HPC uses machines with larger numbers of (thinner) cores, non-trivial NUMA architectures, and fast interconnects. In this work, we investigate the convergence of big data and HPC infrastructure for one of the most challenging application domains, the highly irregular graph processing. We contrast, through a systematic, experimental study of over 300,000 core-hours, the performance of a modern multicore, Intel Knights Landing (KNL), and of traditional big data hardware, in processing representative graph workloads using state-of-the-art graph analytics platforms. The experimental results indicate KNL is convergence-ready, performance-wise, but only after extensive and expert-level tuning of software and hardware parameters.

    Exploring HPC and Big Data Convergence: A Graph Processing Study on Intel Knights Landing

    No full text
    The question 'Can big data and HPC infrastructure converge?' has important implications for many operators and clients of modern computing. However, answering it is challenging. The hardware is currently different, and fast evolving: big data uses machines with modest numbers of fat cores per socket, large caches, and much memory, whereas HPC uses machines with larger numbers of (thinner) cores, non-trivial NUMA architectures, and fast interconnects. In this work, we investigate the convergence of big data and HPC infrastructure for one of the most challenging application domains, the highly irregular graph processing. We contrast, through a systematic, experimental study of over 300,000 core-hours, the performance of a modern multicore, Intel Knights Landing (KNL), and of traditional big data hardware, in processing representative graph workloads using state-of-the-art graph analytics platforms. The experimental results indicate KNL is convergence-ready, performance-wise, but only after extensive and expert-level tuning of software and hardware parameters.

    Exploring HPC and Big Data Convergence: A Graph Processing Study on Intel Knights Landing

    No full text
    The question 'Can big data and HPC infrastructure converge?' has important implications for many operators and clients of modern computing. However, answering it is challenging. The hardware is currently different, and fast evolving: big data uses machines with modest numbers of fat cores per socket, large caches, and much memory, whereas HPC uses machines with larger numbers of (thinner) cores, non-trivial NUMA architectures, and fast interconnects. In this work, we investigate the convergence of big data and HPC infrastructure for one of the most challenging application domains, the highly irregular graph processing. We contrast, through a systematic, experimental study of over 300,000 core-hours, the performance of a modern multicore, Intel Knights Landing (KNL), and of traditional big data hardware, in processing representative graph workloads using state-of-the-art graph analytics platforms. The experimental results indicate KNL is convergence-ready, performance-wise, but only after extensive and expert-level tuning of software and hardware parameters. Accepted author manuscript. Distributed System

    Graphless: Toward serverless graph processing

    No full text
    Our society is increasingly solving complex problems through the use of graph processing. Existing graph processing systems focus on performance, which allows addressing ever-larger and more complex problems. They also require uncommon expertise to properly deploy and utilize. To make graph processing generally accessible (to small and medium enterprises and institutions, to common research groups, to individuals), in this work we design and implement the Graphless graph-processing system. Graphless is based on the serverless paradigm, which proposes to simplify computing by letting developers only focus on small, stateless functions, which are deployed and managed automatically. We address with Graphless the key challenge of combining the stateless functions assumed by serverless computing with the (opposite) data-intensive nature of graph processing. Graphless tackles this challenge through an architectural approach that allows it to deploy with push or with pull operation, and a collection of backend services, such as an orchestrator and a memory-as-a-service component. We implement Graphless and conduct with it real-world experiments using Amazon Lambda for cloud-based serverless resources. Using the LDBC Graphalytics benchmark, we analyze Graphless, and compare its performance and operational cost with the graph-processing systems Apache Giraph (big data domain) and GraphMat (HPC). Overall, we show evidence Graphless provides performance and cost-efficiency similar to Giraph, for algorithms that can benefit from fine-grained elasticity, and lower than GraphMat, but is architecturally easier to deploy, and provides both push and pull operation.

    The AtLarge vision on the design of distributed systems and ecosystems

    No full text
    High-quality designs of distributed systems and services are essential for our digital economy and society. Threatening to slow down the stream of working designs, we identify the mounting pressure of scale and complexity of (eco-)systems, of ill-defined and wicked problems, and of unclear processes, methods, and tools. We envision design itself as a core research topic in distributed systems, to understand and improve the science and practice of distributed (eco-)system design. Toward this vision, we propose the AtLarge design framework, accompanied by a set of 8 core design principles. We also propose 10 key challenges, which we hope the community can address in the following 5 years. In our experience so far, the proposed framework and principles are practical, and lead to pragmatic and innovative designs for large-scale distributed systems.